 ct image




Unsupervised Polychromatic Neural Representation for CT Metal Artifact Reduction

Neural Information Processing Systems

Emerging neural reconstruction techniques based on tomography (e.g., NeRF, NeAT, and NeRP) have started showing unique capabilities in medical imaging. In this work, we present a novel Polychromatic neural representation (Polyner) to tackle the challenging problem of CT imaging when metallic implants exist within the human body. CT metal artifacts arise from the drastic variation of metal's attenuation coefficients at various energy levels of the X-ray spectrum, leading to a nonlinear metal effect in CT measurements. Recovering CT images from metal-affected measurements hence poses a complicated nonlinear inverse problem where empirical models adopted in previous metal artifact reduction (MAR) approaches lead to signal loss and strongly aliased reconstructions.
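The nonlinear metal effect described above can be illustrated with the polychromatic Beer-Lambert law, in which the measured projection is the negative log of a spectrum-weighted sum of exponentials. The following is a minimal sketch and not the Polyner model itself: the two-bin spectrum and the attenuation values are hypothetical numbers chosen only to make the effect visible.

```python
import math

def polychromatic_projection(spectrum, mu_by_energy, path_length_cm):
    """Negative log of the spectrum-weighted transmitted intensity along one ray.

    spectrum: relative photon weights per energy bin (normalized here).
    mu_by_energy: linear attenuation coefficient (1/cm) per energy bin.
    """
    total = sum(spectrum)
    transmitted = sum(
        (w / total) * math.exp(-mu * path_length_cm)
        for w, mu in zip(spectrum, mu_by_energy)
    )
    return -math.log(transmitted)

# Hypothetical values for illustration only (not taken from the paper).
spectrum = [0.5, 0.5]
mu_metal = [4.0, 1.0]  # attenuation varies strongly with energy
mu_flat = [0.2, 0.2]   # attenuation independent of energy

p_metal_1cm = polychromatic_projection(spectrum, mu_metal, 1.0)
p_metal_2cm = polychromatic_projection(spectrum, mu_metal, 2.0)
p_flat_1cm = polychromatic_projection(spectrum, mu_flat, 1.0)
p_flat_2cm = polychromatic_projection(spectrum, mu_flat, 2.0)
```

For the energy-independent material, doubling the path length doubles the projection value. For the strongly energy-dependent material it does not, which is why a linear reconstruction model breaks down near metal.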


Prompting Lipschitz-constrained network for multiple-in-one sparse-view CT reconstruction

Shi, Baoshun, Jiang, Ke, Lian, Qiusheng, Yu, Xinran, Fu, Huazhu

arXiv.org Artificial Intelligence

Despite significant advancements in deep learning-based sparse-view computed tomography (SVCT) reconstruction algorithms, these methods still encounter two primary limitations: (i) it is challenging to explicitly prove that the prior networks of deep unfolding algorithms satisfy Lipschitz constraints due to their empirically designed nature; (ii) the substantial storage costs of training a separate model for each setting in the case of multiple views hinder practical clinical applications. To address these issues, we develop an explicitly provable Lipschitz-constrained network, dubbed LipNet, and integrate an explicit prompt module to provide discriminative knowledge of different sparse sampling settings, enabling the treatment of multiple sparse-view configurations within a single model. Furthermore, we develop a storage-saving deep unfolding framework for multiple-in-one SVCT reconstruction, termed PromptCT, which embeds LipNet as its prior network to ensure the convergence of its corresponding iterative algorithm. In simulated and real data experiments, PromptCT outperforms benchmark reconstruction algorithms in multiple-in-one SVCT reconstruction, achieving higher-quality reconstructions with lower storage costs. On the theoretical side, we explicitly demonstrate that LipNet satisfies the boundary property, further proving its Lipschitz continuity and subsequently analyzing the convergence of the proposed iterative algorithms. The data and code are publicly available at https://github.com/shibaoshun/PromptCT.


An Anatomy Aware Hybrid Deep Learning Framework for Lung Cancer Tumor Stage Classification

Chowdhury, Saniah Kayenat, Sarmun, Rusab, Chowdhury, Muhammad E. H., Zoghoul, Sohaib Bassam, Al-Hashimi, Israa, Mushtak, Adam, Khandakar, Amith

arXiv.org Artificial Intelligence

Accurate lung cancer tumor staging is crucial for prognosis and treatment planning. However, it remains challenging for end-to-end deep learning approaches, as such approaches often overlook spatial and anatomical information that is central to the tumor-node-metastasis system. The tumor stage depends on multiple quantitative criteria, including the tumor size and its proximity to the nearest anatomical structures, and small variations can alter the staging outcome. We propose a medically grounded hybrid pipeline that performs staging by explicitly measuring the tumor's size and distance properties rather than treating it as a pure image classification task. Our method employs specialized encoder-decoder networks to precisely segment the lung and adjacent anatomy, including the lobes, tumor, mediastinum, and diaphragm. Subsequently, we extract the necessary tumor properties, i.e., we measure the largest tumor dimension and calculate the distance between the tumor and neighboring anatomical structures by a quantitative analysis of the segmentation masks. Finally, we apply rule-based tumor staging aligned with the medical guidelines. This novel framework has been evaluated on the Lung-PET-CT-Dx dataset, demonstrating superior performance compared to traditional deep learning models, achieving an overall classification accuracy of 91.36%. We report the per-stage F1-scores of 0.93 (T1), 0.89 (T2), 0.96 (T3), and 0.90 (T4), a critical evaluation aspect often omitted in prior literature. To our knowledge, this is the first study that embeds explicit clinical context into tumor stage classification. Unlike standard convolutional neural networks that operate in an uninterpretable "black box" manner, our method offers both state-of-the-art performance and transparent decision support.
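The measure-then-apply-rules idea can be sketched as follows. This is a simplified illustration and not the authors' pipeline. The helper names are hypothetical, the tumor size is approximated by the largest axis-aligned extent of the mask, and the 3, 5, and 7 cm cut-offs are assumed to follow the size criteria of the eighth-edition TNM classification. The distance and invasion criteria that the abstract also mentions are omitted.

```python
def largest_extent_mm(voxels, spacing_mm):
    """Largest axis-aligned extent (mm) of a set of (z, y, x) voxel indices.

    A crude stand-in for the 'largest tumor dimension'; assumes the mask
    is given as a list of index triples and spacing_mm as (dz, dy, dx).
    """
    extents = []
    for axis in range(3):
        coords = [v[axis] for v in voxels]
        extents.append((max(coords) - min(coords) + 1) * spacing_mm[axis])
    return max(extents)

def t_stage_from_size(size_mm):
    """Size-only T category (assumed TNM 8th edition size cut-offs)."""
    if size_mm <= 30:
        return "T1"
    if size_mm <= 50:
        return "T2"
    if size_mm <= 70:
        return "T3"
    return "T4"

# Hypothetical tumor mask spanning 10 voxels along x with 4 mm spacing.
tumor_voxels = [(0, 0, x) for x in range(10)]
size = largest_extent_mm(tumor_voxels, spacing_mm=(1.0, 1.0, 4.0))
stage = t_stage_from_size(size)
```

Because each decision is an explicit threshold on a measured quantity, the output can be traced back to the measurement that produced it, which is the transparency argument the abstract makes.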


Rethinking Whole-Body CT Image Interpretation: An Abnormality-Centric Approach

Zhao, Ziheng, Dai, Lisong, Zhang, Ya, Wang, Yanfeng, Xie, Weidi

arXiv.org Artificial Intelligence

Automated interpretation of CT images, particularly localizing and describing abnormal findings across multi-plane and whole-body scans, remains a significant challenge in clinical radiology. This work aims to address this challenge through four key contributions: (i) On taxonomy, we collaborate with senior radiologists to propose a comprehensive hierarchical classification system, with 404 representative abnormal findings across all body regions; (ii) On data, we contribute a dataset containing over 14.5K CT images from multiple planes and all human body regions, and meticulously provide grounding annotations for over 19K abnormalities, each linked to the detailed description and cast into the taxonomy; (iii) On model development, we propose OmniAbnorm-CT, which can automatically ground and describe abnormal findings on multi-plane and whole-body CT images based on text queries, while also allowing flexible interaction through visual prompts; (iv) On evaluation, we establish three representative tasks based on real clinical scenarios, and introduce a clinically grounded metric to assess abnormality descriptions. Through extensive experiments, we show that OmniAbnorm-CT can significantly outperform existing methods in both internal and external validations, and across all the tasks.


Fluence Map Prediction with Deep Learning: A Transformer-based Approach

Mgboh, Ujunwa, Sultan, Rafi, Zhu, Dongxiao, Kim, Joshua

arXiv.org Artificial Intelligence

Accurate fluence map prediction is essential in intensity-modulated radiation therapy (IMRT) to maximize tumor coverage while minimizing dose to healthy tissues. Conventional optimization is time-consuming and dependent on planner expertise. This study presents a deep learning framework that accelerates fluence map generation while maintaining clinical quality. An end-to-end 3D Swin-UNETR network was trained to predict nine-beam fluence maps directly from volumetric CT images and anatomical contours using 99 prostate IMRT cases (79 for training and 20 for testing). The transformer-based model employs hierarchical self-attention to capture both local anatomical structures and long-range spatial dependencies. Predicted fluence maps were imported into the Eclipse Treatment Planning System for dose recalculation, and model performance was evaluated using beam-wise fluence correlation, spatial gamma analysis, and dose-volume histogram (DVH) metrics. The proposed model achieved an average R² of 0.95 ± 0.02, MAE of 0.035 ± 0.008, and gamma passing rate of 85 ± 10% (3%/3 mm) on the test set, with no significant differences observed in DVH parameters between predicted and clinical plans. The Swin-UNETR framework enables fully automated, inverse-free fluence map prediction directly from anatomical inputs, enhancing spatial coherence, accuracy, and efficiency while offering a scalable and consistent solution for automated IMRT plan generation.
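The two fluence-level scores reported above can be computed per beam along the following lines. This is a minimal sketch under stated assumptions: the abstract does not say whether R² is the coefficient of determination or a squared correlation, so the former is assumed here, and the fluence values are hypothetical flattened maps.

```python
def mean_absolute_error(pred, target):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)

def r_squared(pred, target):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_t = sum(target) / len(target)
    ss_res = sum((t - p) ** 2 for p, t in zip(pred, target))
    ss_tot = sum((t - mean_t) ** 2 for t in target)
    return 1.0 - ss_res / ss_tot

# Hypothetical flattened fluence maps for one beam (illustrative values).
clinical = [0.0, 1.0, 2.0, 3.0]
predicted = [0.5, 1.0, 2.0, 2.5]

beam_mae = mean_absolute_error(predicted, clinical)
beam_r2 = r_squared(predicted, clinical)
```

Averaging these per-beam values over all beams and test cases would give summary figures of the kind quoted in the abstract.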


Shape Deformation Networks for Automated Aortic Valve Finite Element Meshing from 3D CT Images

Qian, Linchen, Chen, Jiasong, Gong, Ruonan, Sun, Wei, Liu, Minliang, Liang, Liang

arXiv.org Artificial Intelligence

Accurate geometric modeling of the aortic valve from 3D CT images is essential for biomechanical analysis and patient-specific simulations to assess valve health or make a preoperative plan. However, it remains challenging to generate aortic valve meshes with both high quality and consistency across different patients. Traditional approaches often produce triangular meshes with irregular topologies, which can result in poorly shaped elements and inconsistent correspondence due to inter-patient anatomical variation. In this work, we address these challenges by introducing a template-fitting pipeline with deep neural networks to generate structured quad (i.e., quadrilateral) meshes from 3D CT images to represent aortic valve geometries. By remeshing aortic valves of all patients with a common quad mesh template, we ensure a uniform mesh topology with consistent node-to-node and element-to-element correspondence across patients. This consistency enables us to simplify the learning objective of the deep neural networks by employing a loss function with only two terms (i.e., a geometry reconstruction term and a smoothness regularization term), which is sufficient to preserve mesh smoothness and element quality. Our experiments demonstrate that the proposed approach produces high-quality aortic valve surface meshes with improved smoothness and shape quality, while requiring fewer explicit regularization terms compared to the traditional methods. These results highlight that using structured quad meshes for the template and neural network training not only ensures mesh correspondence and quality but also simplifies the training process, thus enhancing the effectiveness and efficiency of aortic valve modeling.
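A two-term objective of this kind might look like the sketch below. The abstract does not give the exact form of either term, so both are assumptions: the geometry term is a mean squared node-to-node distance, which is only meaningful because the template provides correspondence, and the smoothness term is a uniform Laplacian penalty. The function name and the weight are hypothetical.

```python
def two_term_mesh_loss(pred, target, neighbors, smooth_weight=0.1):
    """Return (reconstruction, smoothness, total) for a predicted mesh.

    pred, target: lists of (x, y, z) node positions with matching order.
    neighbors: for each node, the indices of its adjacent nodes.
    """
    n = len(pred)
    # Geometry term: mean squared distance between corresponding nodes.
    recon = sum(
        sum((p - t) ** 2 for p, t in zip(pn, tn))
        for pn, tn in zip(pred, target)
    ) / n
    # Smoothness term: squared distance of each node to its neighbor centroid.
    smooth = 0.0
    for i, nbrs in enumerate(neighbors):
        centroid = [sum(pred[j][k] for j in nbrs) / len(nbrs) for k in range(3)]
        smooth += sum((pred[i][k] - centroid[k]) ** 2 for k in range(3))
    smooth /= n
    return recon, smooth, recon + smooth_weight * smooth

# Hypothetical three-node strip; the middle node is displaced by 1 along y.
target_nodes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
pred_nodes = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0)]
adjacency = [[1], [0, 2], [1]]

recon_term, smooth_term, total_loss = two_term_mesh_loss(
    pred_nodes, target_nodes, adjacency
)
```

With a shared template, the adjacency list is identical for every patient, so the same smoothness term applies unchanged across the whole dataset.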


A Novel Multi-branch ConvNeXt Architecture for Identifying Subtle Pathological Features in CT Scans

Perera, Irash, Thayasivam, Uthayasanker

arXiv.org Artificial Intelligence

Intelligent analysis of medical imaging plays a crucial role in assisting clinical diagnosis, especially for identifying subtle pathological features. This paper introduces a novel multi-branch ConvNeXt architecture designed specifically for the nuanced challenges of medical image analysis. While applied here to the specific problem of COVID-19 diagnosis, the methodology offers a generalizable framework for classifying a wide range of pathologies from CT scans. The proposed model incorporates a rigorous end-to-end pipeline, from meticulous data preprocessing and augmentation to a disciplined two-phase training strategy that leverages transfer learning effectively. The architecture uniquely integrates features extracted from three parallel branches: Global Average Pooling, Global Max Pooling, and a new Attention-weighted Pooling mechanism. The model was trained and validated on a combined dataset of 2,609 CT slices derived from two distinct datasets. Experimental results demonstrate superior performance on the validation set, achieving a final ROC-AUC of 0.9937, a validation accuracy of 0.9757, and an F1-score of 0.9825 for COVID-19 cases, outperforming all previously reported models on this dataset. These findings indicate that a modern, multi-branch architecture, coupled with careful data handling, can achieve performance comparable to or exceeding contemporary state-of-the-art models, thereby proving the efficacy of advanced deep learning techniques for robust medical diagnostics.
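The three-branch pooling head can be sketched in plain Python as shown below. This is an illustration under assumptions, not the paper's implementation: the abstract does not specify the attention mechanism, so a softmax over per-position scores is assumed, the scores are passed in rather than learned, and the branch outputs are simply concatenated.

```python
import math

def global_avg_pool(fmap):
    """fmap: list of channels, each a flat list of spatial activations."""
    return [sum(ch) / len(ch) for ch in fmap]

def global_max_pool(fmap):
    return [max(ch) for ch in fmap]

def attention_pool(fmap, scores):
    """Softmax the spatial scores, then take a weighted sum per channel."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # shifted for stability
    norm = sum(exps)
    weights = [e / norm for e in exps]
    return [sum(w * v for w, v in zip(weights, ch)) for ch in fmap]

def multi_branch_features(fmap, scores):
    """Concatenate the three pooled descriptors into one feature vector."""
    return (
        global_avg_pool(fmap)
        + global_max_pool(fmap)
        + attention_pool(fmap, scores)
    )

# Hypothetical 2-channel feature map with 4 spatial positions.
feature_map = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 8.0]]
uniform_scores = [0.0, 0.0, 0.0, 0.0]
features = multi_branch_features(feature_map, uniform_scores)
```

With uniform scores the attention branch reduces to average pooling. Non-uniform scores let it emphasize a few positions, which sits between the averaging and max branches.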


Poisson Flow Consistency Training

Zhang, Anthony, Gokmen, Mahmut, Hein, Dennis, Ge, Rongjun, Xia, Wenjun, Wang, Ge, Chen, Jin

arXiv.org Artificial Intelligence

The Poisson Flow Consistency Model (PFCM) is a consistency-style model based on the robust Poisson Flow Generative Model++ (PFGM++), which has achieved success in unconditional image generation and CT image denoising. Yet the PFCM can only be trained by distillation, which limits its potential in many data modalities. The objective of this research was to create a method, called Poisson Flow Consistency Training (PFCT), to train the PFCM in isolation. The perturbation kernel was leveraged to remove the need for a pretrained PFGM++, and a sinusoidal discretization schedule and a Beta noise distribution were introduced to facilitate adaptability and improve sample quality. The model was applied to the task of low-dose computed tomography image denoising and improved the low-dose image in terms of LPIPS and SSIM. It also displayed denoising effectiveness similar to that of models such as the Consistency Model. PFCT is established as a valid method of training the PFCM by its effectiveness in denoising CT images, with results competitive with those of other generative models. Further study is needed on the precise optimization of PFCT and on its applicability to other generative modeling tasks. The PFCT framework offers more flexibility in how a PFCM can be created and can be applied more broadly in generative modeling.